[WIP][experimental] multi turn chat benchmark #821

Draft
cquil11 wants to merge 78 commits into main from experimental/multi-turn-benchmark
@cquil11 (Collaborator) commented Feb 27, 2026

No description provided.

Rohan138 and others added 30 commits January 26, 2026 17:15
* fix AITER flags for v0.14.0 release

* drop mi325 triton gemm env var

* Add changes to perf changelog
…wont be erroneous negative diff [skip-sweep] (#571)
* remove assign

* initial

* update perf

* fix perf changelog

* trigger test sweep

* trigger test sweep pt 2

* rebase for evals only

* Update perf-changelog.yaml

* remove newline

* update perf changelog

---------

Co-authored-by: Cam Quilici <cjquilici@gmail.com>
* b300 srt slurm

* update generated srtslurm yaml

Signed-off-by: jthomson04 <jothomson@nvidia.com>

* fix image

* add uv and sqsh file

* change partition

* change slurm account

* use regular srt

Signed-off-by: jthomson04 <jothomson@nvidia.com>

* update perf changelog

Signed-off-by: jthomson04 <jothomson@nvidia.com>

* fix runner

Signed-off-by: jthomson04 <jothomson@nvidia.com>

* correct account

Signed-off-by: jthomson04 <jothomson@nvidia.com>

* qos support

Signed-off-by: jthomson04 <jothomson@nvidia.com>

* fix git checkout

Signed-off-by: jthomson04 <jothomson@nvidia.com>

* update runner label and partition

* undo branch checkout

Signed-off-by: jthomson04 <jothomson@nvidia.com>

* debug info

Signed-off-by: jthomson04 <jothomson@nvidia.com>

* cleanup logging

Signed-off-by: jthomson04 <jothomson@nvidia.com>

* use local model dir

Signed-off-by: jthomson04 <jothomson@nvidia.com>

* checkout specific commit

Signed-off-by: jthomson04 <jothomson@nvidia.com>

---------

Signed-off-by: jthomson04 <jothomson@nvidia.com>
Co-authored-by: Sahithi Chigurupati <schigurupati@nvidia.com>
Co-authored-by: Sahithi Chigurupati <chigurupati.sahithi@gmail.com>
…wont be erroneous negative diff [skip-sweep] (#577)
* Update SGLang Docker Image for MI355 to v0.5.8

1. activate FP8 KV cache
2. use the MLA persistent kernel

* Do not activate FP8 KV cache and the MLA persistent kernel explicitly

* Add config-keys (v0.5.5.post3 --> v0.5.8)

* Update perf-changelog.yaml with key fix description for v0.5.8

Add description: Disables mla persistent kernel when not using fp8 kv_cache

Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>

---------

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>
Co-authored-by: functionstackx <47992694+functionstackx@users.noreply.github.com>
30s default to 300s
* chore: save server log as artifact after single node runs

* test flaky eval

* test flaky eval

* test flaky eval

* rebase

* rebase pt 2

* add trap to upload server logs on exit

* rebase pt 3

* make server log in gha workspace

* export result filename at runtime so it is present

* revert perf changelog
* chore: add pre-merge check for newline in perf-changelog.yaml

Add a validation step in run-sweep.yml that ensures perf-changelog.yaml
ends with a newline character. This prevents negative diff issues in
subsequent PRs when the file is appended to.
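The check described above can be sketched roughly as follows. This is a minimal illustration only, not the actual step in run-sweep.yml, which is not shown here; the function name `ends_with_newline` is hypothetical, and the sketch assumes the file is read as raw bytes.

```python
from pathlib import Path


def ends_with_newline(path: Path) -> bool:
    """Return True if the file's last byte is a newline.

    Hypothetical helper: an empty file is treated as failing the check.
    """
    data = path.read_bytes()
    return data.endswith(b"\n")


if __name__ == "__main__":
    import sys

    # Assumed usage: pass the changelog path as the first argument.
    target = Path(sys.argv[1]) if len(sys.argv) > 1 else Path("perf-changelog.yaml")
    if not target.exists() or not ends_with_newline(target):
        print(f"{target} is missing or does not end with a newline")
        sys.exit(1)
```

When a file lacks a trailing newline, appending an entry also modifies the previous last line, so that line appears as removed and re-added in the diff; this is presumably the "negative diff" the commit message refers to.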

Closes #578

Co-authored-by: Cameron Quilici <cquil11@users.noreply.github.com>

* test

* change logic of newline check

* trigger test check

* remove test perf changelog

---------

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Cameron Quilici <cquil11@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@functionstackx changed the title from "[WIP][experimental] multi turn benchmark" to "[WIP][experimental] multi turn chat benchmark" on Feb 28, 2026

5 participants